194 research outputs found

    Integrating SNP data and imputation methods into the DNA methylation analysis framework

    DNA methylation is a widely studied epigenetic modification that can influence the expression and regulation of functional genes, especially those related to aging, cancer and other diseases. The common goal of methylation studies is to find differences in methylation levels between samples collected under different conditions. Differences can be detected at the site level, but regulated methylation targets are most commonly clustered into short regions. Thus, identifying differentially methylated regions (DMRs) between different groups is of prime interest. Despite advanced technology that enables measuring methylation genome-wide, misinterpretations in the readings can arise due to the existence of single nucleotide polymorphisms (SNPs) in the target sequence. One of the main pre-processing steps in DMR detection methods involves filtering out potential SNP-related probes due to this issue. This work proposes leveraging the current trend of collecting both SNP and methylation data on the same individual, making it possible to integrate SNP data into the DNA methylation analysis framework. This will enable the originally filtered potential SNPs to be restored if a SNP is not actually present. Furthermore, when a SNP is present or other missing data issues arise, imputation methods are proposed for methylation data. First, regularized linear regression (ridge, LASSO and elastic net) imputation models are proposed, along with a variable screening technique to restrict the number of variables in the models. Functional principal component regression imputation is also proposed as an alternative approach. The proposed imputation methods are compared to existing methods and evaluated based on imputation accuracy and DMR detection ability using both real and simulated data. One of the proposed methods (elastic net with variable screening) shows effective imputation accuracy without sacrificing computation efficiency across a variety of settings, while greatly improving the number of true positive DMR detections. (Abstract, page iii)
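The abstract above names elastic net imputation with variable screening but gives no implementation details. The following is only a minimal sketch of the general idea, under stated assumptions: screening by marginal correlation, a plain cyclic coordinate-descent solver, and noise-free toy data are all illustrative choices, and every function name here is hypothetical rather than taken from the thesis.

```python
from itertools import product
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def screen(X, y, k):
    """Assumed screening rule: keep the k columns with the largest |correlation| with y."""
    p = len(X[0])
    score = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    return sorted(sorted(range(p), key=lambda j: -score[j])[:k])


def soft_threshold(z, g):
    """Shrink z towards zero by g; values within [-g, g] become exactly 0."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha, sweeps=200):
    """Minimise (1/2n)||y - b0 - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)
    by cyclic coordinate descent. Returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred predictors
    resid = [yi - y_mean for yi in y]                           # residual for beta = 0
    beta = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual that excludes column j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            denom = col_sq[j] + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


# Hypothetical methylation levels at three neighbouring sites for 8 complete samples.
X = [list(row) for row in product([0.2, 0.8], repeat=3)]
y = [0.1 + 0.5 * r[0] + 0.3 * r[1] for r in X]  # target site, noise-free toy data

kept = screen(X, y, k=2)
X_kept = [[row[j] for j in kept] for row in X]
intercept, beta = elastic_net(X_kept, y, lam=0.0, alpha=0.5)

new_sample = [0.8, 0.2, 0.5]  # neighbouring sites of a sample whose target value is missing
imputed = intercept + sum(b * new_sample[j] for b, j in zip(beta, kept))
```

With `lam=0.0` the fit reduces to ordinary least squares, which makes the toy result easy to check by hand; a positive `lam` shrinks the coefficients, and `alpha` moves the penalty between ridge (`alpha=0`) and LASSO (`alpha=1`).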

    Over-Confidence and Cycles in Real Estate Markets: Cases in Hong Kong and Asia

    Studies on the calibration of subjective probabilities find that people tend to over-estimate the precision of their knowledge. In this paper we develop a semi-rational model and apply it to the real estate markets in Hong Kong and other Asian countries. The key point is that a person is rational about her/his private information until that information is confirmed by a clearly defined market signal. Using a pre-sale as a mechanism for updating a developer's beliefs, this paper analyzes the impact of over-confidence on overbuilding and cycles in real estate markets. Our findings indicate that pre-sale activity will increase the magnitude of overbuilding and that over-confidence will increase the volatility in real estate markets. Our model also has implications for the well-established literature dealing with the issue of over-capacity in many industrial sectors.

    Melt Pond Retrieval Based on the LinearPolar Algorithm Using Landsat Data

    The formation and distribution of melt ponds have an important influence on the Arctic climate. Therefore, it is necessary to obtain more accurate information on melt ponds on Arctic sea ice by remote sensing. The present large-scale melt pond products, especially the melt pond fraction (MPF), still require verification, and using very high resolution optical satellite remote sensing data is a good way to verify the large-scale retrieval of MPF products. Unlike most MPF algorithms using very high resolution data, the LinearPolar algorithm using Sentinel-2 data considers the albedo of melt ponds unfixed. In this paper, by selecting the best band combination, we applied this algorithm to Landsat 8 (L8) data. Moreover, Sentinel-2 data, as well as support vector machine (SVM) and iterative self-organizing data analysis technique (ISODATA) algorithms, are used as the comparison and verification data. The results show that the recognition accuracy of the LinearPolar algorithm for melt ponds is higher than that of previous algorithms. The overall accuracy and kappa coefficient results achieved by using the LinearPolar algorithm with L8 and Sentinel-2A (S2), the SVM algorithm, and the ISODATA algorithm are 95.38% and 0.88, 94.73% and 0.86, and 92.40% and 0.80, respectively, which are much higher than those of principal component analysis (PCA) and Markus algorithms. The mean MPF (10.0%) obtained from 80 cases from L8 data based on the LinearPolar algorithm is much closer to Sentinel-2 (10.9%) than the Markus (5.0%) and PCA algorithms (4.2%), with a mean MPF difference of only 0.9%, and the correlation coefficients of the two MPFs are as high as 0.95. The overall relative error of the LinearPolar algorithm is 53.5% and 46.4% lower than that of the Markus and PCA algorithms, respectively, and the root mean square error (RMSE) is 30.9% and 27.4% lower than that of the Markus and PCA algorithms, respectively.
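The abstract above reports overall accuracy and the kappa coefficient without showing how they are derived. As a hedged illustration, both can be computed from a confusion matrix as sketched below; the function name and the two-class matrix (pond versus non-pond) are hypothetical and are not data from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples with reference class i and predicted class j.

    Returns (overall accuracy, Cohen's kappa). Assumes chance agreement is below 1,
    i.e. the samples are not all in a single class for both reference and prediction.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n        # overall accuracy
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal class frequencies.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical validation counts, for illustration only.
example = [[45, 5],
           [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

For this toy matrix the observed agreement is 0.9 and the chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.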

    DIFFERENTIAL METHYLATION METHODS IN MULTI-CONTEXT ORGANISMS

    DNA methylation is an epigenetic modification that has the ability to alter gene expression without any change in the DNA sequence. DNA methylation occurs when a methyl chemical group attaches to cytosine bases on the DNA sequence. In mammals, DNA methylation primarily occurs at CG sites, when a cytosine is followed by a guanine in the DNA sequence. In plants, DNA methylation can also occur in other cytosine sequences, such as when a cytosine is not followed directly by a guanine. Many of the statistical methods that have been developed to estimate methylation levels and test differential methylation in whole-genome bisulfite sequencing studies incorporate the observed correlation between methylation levels of neighboring cytosine sites. However, most of these methods have been applied to human studies, where only CG sites are investigated. In this study, we focus on plant studies and show that the correlation between methylation levels at neighboring sites depends on the DNA sequence immediately following the cytosine. We investigate the importance of accounting for these differences in the correlation structure by comparing the performance of three existing methods (MethylSig, MAGI, and M3D) in plants.
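The abstract above distinguishes cytosines by the bases that immediately follow them. As an illustrative sketch only, the classification can be written as below, using the conventional CG / CHG / CHH labels (H meaning A, C, or T); those labels and the function name are assumptions that do not appear in the abstract, and only forward-strand cytosines are considered.

```python
def cytosine_contexts(seq):
    """Return (position, context) for each forward-strand cytosine in seq.

    Context is 'CG', 'CHG', or 'CHH', where H is A, C, or T. Cytosines whose
    following bases are missing or ambiguous (e.g. near the 3' end, or next
    to an N) are skipped rather than guessed.
    """
    seq = seq.upper()
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        following = seq[i + 1:i + 3]
        if len(following) >= 1 and following[0] == "G":
            contexts.append((i, "CG"))
        elif len(following) == 2 and following[0] in "ACT":
            if following[1] == "G":
                contexts.append((i, "CHG"))
            elif following[1] in "ACT":
                contexts.append((i, "CHH"))
    return contexts


# Hypothetical toy sequence, for illustration only.
sites = cytosine_contexts("CGACAGCTT")
```

Grouping sites by such a label is one way neighbouring-site correlations could then be examined separately per context.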

    Inline Text Entry On Portable Electronic Devices

    This publication describes systems and techniques to provide inline text entry on portable electronic devices. Portable electronic devices, such as smartphones, generally include an on-screen keyboard to allow users to input alphanumeric characters. These keyboards generally provide several suggestions for the word that the user is currently typing or the next word to be input. Because the keyboard has a limited area on the graphical user interface (GUI) to display candidate words, the keyboard can only present a few suggestions (e.g., two or three candidates), which are generally single-word candidates. This publication describes a keyboard for portable electronic devices that displays inline candidate words, which can include multiple words, entire phrases, and complete sentences. The inline suggestions can be shown directly in the editor box of an application or a pop-up window. The inline suggestions allow users to type faster and reduce spelling and grammatical errors in applications on portable electronic devices.

    A Survey on Automated Program Repair Techniques

    With the rapid development and large-scale popularity of software, modern society increasingly relies on software systems. However, the problems exposed by software have also come to the fore. Software defects have become an important factor troubling developers. In this context, Automated Program Repair (APR) techniques have emerged, aiming to automatically fix software defects and reduce manual debugging work. In particular, benefiting from the advances in deep learning, numerous learning-based APR techniques have emerged in recent years, which also bring new opportunities for APR research. To give researchers a quick overview of the complete development of APR techniques and future opportunities, we revisit the evolution of APR techniques and discuss in depth the latest advances in APR research. In this paper, the development of APR techniques is introduced in terms of four different patch generation schemes: search-based, constraint-based, template-based, and learning-based. Moreover, we propose a uniform set of criteria to review and compare each APR tool, summarize the advantages and disadvantages of APR techniques, and discuss the current state of APR development. Furthermore, we introduce the research on the related technical areas of APR that have also provided a strong motivation to advance APR development. Finally, we analyze current challenges and future directions, especially highlighting the critical opportunities that large language models bring to APR research. Comment: This paper's earlier version was submitted to CSUR in August 202

    Composite functional module inference: detecting cooperation between transcriptional regulation and protein interaction by mantel test

    Background: Functional modules are basic units of cell function, and exploring them is important for understanding the organization, regulation and execution of cell processes. Functional modules in single biological networks (e.g., the protein-protein interaction network) have been the focus of recent studies. Functional modules in the integrated network are composite functional modules, which imply complex relationships involving multiple biological interaction types, and detecting them will help us understand the complexity of cell processes. Results: We aimed to detect composite functional modules containing co-transcriptional regulation interaction and protein-protein interaction in our pre-constructed integrated network of Saccharomyces cerevisiae. We computationally extracted 15 composite functional modules, and found structural consistency between the co-transcriptional regulation interaction sub-network and the protein-protein interaction sub-network that was well correlated with their functional hierarchy. This type of composite functional module was compact in structure, and was found to participate in essential cell processes such as oxidative phosphorylation and RNA splicing. Conclusions: The structure of composite functional modules containing co-transcriptional regulation interaction and protein-protein interaction reflected the cooperation of transcriptional regulation and protein function implementation, and was indicative of their important roles in essential cell functions. In addition, their structural and functional characteristics were closely related, suggesting the complexity of the cell regulatory system.
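The title of the entry above names the Mantel test, but the abstract does not describe the procedure. As a rough illustration only, a permutation Mantel test on two small distance matrices might look like the following sketch; the function names, the one-sided p-value, and the toy matrices are assumptions and are not taken from the paper.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of matrix after relabelling rows and columns by order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(a, b, permutations=999, seed=0):
    """Correlate two symmetric distance matrices and estimate a one-sided p-value.

    Rows and columns of b are permuted together, which keeps b a valid distance
    matrix while breaking any association with a.
    """
    n = len(a)
    identity = list(range(n))
    fixed = _upper(a, identity)
    observed = _pearson(fixed, _upper(b, identity))
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(fixed, _upper(b, order)) >= observed:
            at_least_as_large += 1
    # Count the observed arrangement itself so the p-value is never exactly 0.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical example: distances between five points on a line, and the same
# distances doubled, so the two matrices are perfectly correlated.
points = [0, 1, 3, 6, 10]
a = [[abs(p - q) for q in points] for p in points]
b = [[2 * abs(p - q) for q in points] for p in points]
r, p = mantel_test(a, b)
```

In a setting like the one described, `a` and `b` could hold pairwise distances between the same genes measured in two different interaction networks, though how the paper constructs them is not stated in the abstract.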

    Marked disability and high use of nonsteroidal antiinflammatory drugs associated with knee osteoarthritis in rural China: a cross-sectional population-based survey

    Introduction: The burden of disability, analgesia, and health services use associated with knee pain and osteoarthritis (OA) in developing countries is relatively unknown, despite a high proportion of these populations required to be engaged in heavy occupational physical activity throughout their life span. The aim of this survey was to estimate the burden of disability, analgesia, and health services use associated with knee pain in rural China. Methods: This was a population-based cross-sectional survey among residents, aged 50 years and older, of Wuchuan County, Inner Mongolia. Participants completed an interviewer-based questionnaire, evaluating knee pain and associated disability, analgesia, and health services use, and obtained bilateral standardized weight-bearing knee radiographs. Results: Of the 1,027 participants, 513 (50%) reported knee pain on most days of at least 1 month in the past year, with 109 (21%) also demonstrating radiographic OA (Kellgren-Lawrence grade >= 2) in the symptomatic knee. Adjusting for age, gender, body mass index (BMI), education, and back pain, the presence of knee pain was associated with significantly greater difficulty in walking, climbing 10 steps, stooping, completing cleaning chores, and preparing meals. Among the 513 subjects with knee pain, the additional presence of radiographic evidence of OA was significantly associated with more occasions of "unbearable" pain (59% versus 36%) and restricted activity (64% versus 39%), as well as increased use of nonsteroidal antiinflammatory drugs (NSAIDs) (88% versus 78%) and the reported number of doctor visits (59% versus 33%) in the past year. The use of paracetamol for knee pain was rare (6% versus 2%). Conclusions: Knee pain is highly prevalent in rural northern China. The associated significant disability and marked preferential use of NSAIDs as analgesia should be of concern in these communities reliant on heavy occupational physical activity for their livelihood. The findings will be useful to guide the distribution of future health care resources and preventive strategies. A similar article has been published in the Chinese language journal, National Medical Journal of China.